ABSTRACT


A methodology for performing fault-tolerant system reliability analysis
is presented. The method decomposes a system into its subsystems,
evaluates event rates derived from each subsystem's conditional state
probability vector, and incorporates those results into a hierarchical
Markov model of the system. This is done in a manner that addresses the
failure sequence dependence associated with the system's redundancy
management strategy. The method is derived for application to a specific
system definition. Results are presented that compare the hierarchical
model's unreliability prediction to that of a more complicated standard
Markov model of the system. The results for the example given indicate
that the hierarchical method predicts system unreliability to a
desirable level of accuracy while achieving significant computational
savings relative to a component-level Markov model of the system.
CHAPTER 1. INTRODUCTION


1.1 Background

Analytic reliability modeling is required to support the design and
validation of highly reliable fault-tolerant systems (Ref {1}). The most
accurate method for evaluating system reliability involves the life
testing of many systems, with reliability statistics subsequently
derived from the test data. However, for large, highly reliable, complex
systems this is impractical due to the time and expense involved in
testing such a system. Consequently, it is necessary to model the
reliability of such systems analytically.

There are two general approaches to system reliability modeling, both
of which rely upon knowledge of component failure rates (which can be
found through life testing). First, there are combinatorial methods
(Ref {2}, {3}) in which component reliabilities and unreliabilities are
used in conjunction with an enumeration of failure events to arrive at a
system reliability prediction. Second, there is the Markov model
approach (Ref {4}). This approach associates failure combinations with
states of a Markov chain wherein component failure rates define
transition rates.
1.2 Motivation

Both approaches encounter difficulties with the evaluation of large
complex systems. A common combinatorial method is the event space
method. This method utilizes an enumeration of combinations of failed and
unfailed components. Each of these combinations defines the condition
of every component in the system so that combinations are mutually
exclusive and hence the probabilities are additive. For a large complex
system, the large number of components will lead to many complex terms,
and thus a large algebraic expression to be evaluated.
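
The enumeration behind the event space method can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three components, their reliabilities of 0.9, and the "at least two of three working" success rule are hypothetical and are not taken from the example system of this study.

```python
from itertools import product


def event_space_reliability(reliabilities, is_operational):
    """Sum the probabilities of all operational failed/unfailed combinations."""
    total = 0.0
    # Each combination fixes the condition of every component:
    # True means unfailed, False means failed.
    for combo in product([True, False], repeat=len(reliabilities)):
        prob = 1.0
        for works, r in zip(combo, reliabilities):
            prob *= r if works else (1.0 - r)
        # Combinations are mutually exclusive, so their probabilities add.
        if is_operational(combo):
            total += prob
    return total


# Hypothetical example: three independent components, each with
# reliability 0.9; the system is taken to work if at least two work.
component_reliabilities = [0.9, 0.9, 0.9]
system_reliability = event_space_reliability(
    component_reliabilities, lambda combo: sum(combo) >= 2
)
print(system_reliability)
```

The loop visits 2^n combinations for n components, which is the growth in terms that the paragraph above describes.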


This situation can be mitigated somewhat through the use of structural
decomposition (Ref {5}). Structural decomposition consists of
dividing a system into smaller independent subsystems (redundancy
management is local to the subsystem), analyzing the subsystems, and then
combining the subsystem results combinatorially to obtain the results
for the system. However, this approach breaks down when the time
sequence of events is an issue.

There are two types of sequence dependencies that arise in the
evaluation of fault-tolerant systems. First, there are time-ordered event
sequences associated with false alarms. This problem has been
investigated by Luppold et al. (Ref {6}). Second, there are time-ordered
event sequences that result from the system's redundancy management
policy. In particular, Schabowsky et al. (Ref {5}) show how system
reconfiguration can introduce time-ordered event sequences which greatly
complicate a combinatorial approach.

In order to demonstrate the problem of time ordering of events
resulting from a reconfiguration strategy, the unreliability of an
example system (described in Chapter 2) is examined using three methods:
the event-space method, a component-level Markov model, and the
hierarchical approach that is explored in this study. The results in
Figure 1.1 clearly show that the absence of time-ordering
considerations in the combinatorial analysis leads to a conservative
unreliability prediction. This is because some operational states are
treated as non-operational states in the event space method, whereas
they are accurately represented by the other two methods.

Although a Markov model approach conveniently handles time-ordered
events, this approach encounters difficulty when a system with a large
number of components is considered. With many components (and thus many
combinations of failures to be considered) the Markov model may have a
very large state space. A large state space translates into a large
system of differential equations that must be solved to produce an
unreliability prediction. The proposed hierarchical approach mitigates the
problem of state proliferation while still addressing the problem of
sequence dependency and, furthermore, produces state history
probabilities for the system, whereas the combinatorial approaches do not.

Figure 1.1

Motivating Example
(curves: hierarchical, component-level, combinatorial; horizontal axis: TIME (hrs))

1.3 Methodology

The underlying concept of the hierarchical approach is that of
aggregating component-level Markov model states into hierarchical states.
The associated hierarchical state transitions are often time-varying to
reflect multiple (non-simultaneous) failures within a given subsystem.
These transitions correspond to subsystem-level events such as
performance degradation and subsystem loss, and in turn usually imply
system-level performance degradation. Consequently, the hierarchical
model becomes semi-Markov because the holding time probability density
functions for each hierarchical state are generally no longer the same
for all transitions from a particular hierarchical state to other
hierarchical states. In fact, the holding time distributions in the
hierarchical states are often Erlang (Ref {7}) (or more general) and this
leads to a problem. For a semi-Markov chain where the failure rate from
state i to state j can be expressed as $\lambda_{ij}(t)$ and
$\lambda_i(t) = \sum_j \lambda_{ij}(t)$,

$$P_j(t) = P_j(0)\,e^{-\int_0^t \lambda_j(\tau)\,d\tau} + \sum_{i \ne j} \int_0^t P_i(x)\,\lambda_{ij}(x)\,e^{-\int_x^t \lambda_j(\tau)\,d\tau}\,dx \tag{1.1}$$

(Ref {8}). Even for a small number of states this is computationally
very difficult to solve. Note that the second term in equation 1.1
contains a convolution integral. This arises due to the dependence of a
semi-Markov state's holding time on the time at which the state was
entered (local time). The hierarchical approach instead imposes the
Markov property, so that holding times are memoryless and the local
time dependence disappears. Specifically, a single common holding time
distribution is assumed for all exit transitions from a particular
hierarchical state, which allows us to write differential equations for
the system state probabilities and eliminates the need to convolve
explicitly.

In the hierarchical approach, subsystem-level events are addressed
through the use of time-varying event rates computed from the
component-level Markov models of the system's particular subsystems. These
event rates are embedded into the hierarchical system model. The
resulting model is then evaluated over a mission time. Note that the
subsystem events of interest are application-dependent because they
depend on the system's redundancy management strategy. For example, one
of the subsystem event rates might be associated with a particular
degraded state while another event rate is the subsystem failure rate.

It is important to note a current restriction on the hierarchical
approach. At this juncture the issue of repair has not been addressed
in the hierarchical approach. We also note that White, Butler and Lee
(Ref {9}, {10}, {13}) examine a semi-Markov approach to unreliability
evaluation. However, their approach does not address the state
proliferation problem in that it utilizes component-level events rather
than subsystem-level events as in the hierarchical approach and thus does
not reduce the state space of the fault-occurrence model.

1.4 Organization 

The purpose of this thesis is to present a hierarchical approach to
reliability modeling of fault-tolerant systems made up of fault-tolerant
'building blocks' (Ref {5}). The technique is developed in the context of
an example architecture. The first section of Chapter Two characterizes
the development of a hierarchical model. Section 2.2 defines the terms
to be used in the derivation of the hierarchical approach and defines a
hierarchical Markov state. Section 2.3 presents the sample architecture
to which the hierarchical approach is applied. Section 2.4 gives a
general form of the solution of a hierarchical model.

Chapter Three describes the implementation and simulation of the 
hierarchical approach as it is applied to the sample architecture of 
Section 2.3. 
Chapter Four presents the results of the various test cases
implemented.

Chapter Five summarizes the approach and the results of this study and
presents topics for further research.



CHAPTER 2. A HIERARCHICAL APPROACH TO RELIABILITY MODELING 


2.1 Outline of Hierarchical Model Development 

Hierarchical model development is a four-stage process as shown in
figure 2.1. The process begins with a system definition. For a
fault-tolerant system, a system definition involves defining the
architecture and redundancy management strategy. From the system
definition we identify the set of subsystem-level events that are of
interest. These include events which trigger reconfigurations and the
reaching of various performance levels in a subsystem.

For the identified set of events, a set of component-level event
models is developed. There will be an event model for each unique event
defined. From each of these models, an (approximate) event rate can be
derived. Using the event definitions and event rates, a hierarchical
model of the system is developed. Each hierarchical state represents
the system status with respect to the occurrence or non-occurrence of
the set of subsystem events. The rates of occurrence of these events
appear as the transition rates among the states of the hierarchical
model.
The hierarchical model state descriptions are unique. We know that a
full (no state aggregation or model truncation) unique component-level
model of the system can be developed from the system definition. We
also know that each state from that model maps uniquely into a
hierarchical state. Thus the hierarchical model state descriptions are
unique. Next we define the terms to be used in our development of the
hierarchical approach.

2.2 Definitions 

2.2.1 Definition of Terms 

In deriving the hierarchical approach it is important to define the
context and relationships between the terms used. Those terms are:
component, subsystem, system, reconfiguration rules, component-level
models, subsystem models, and hierarchical models. We begin with a simple
example. Then a general form of the solution is presented for a more
complex example which is used in the remainder of this work.

A simple example is used here to introduce the terms used in the 
hierarchical approach. The fault-tolerant system is made up of two 
fault-tolerant subsystems and is shown in figure 2.2. 
[Figure 2.2: two subsystems, one containing two A components and one containing two B components]

Each fault-tolerant subsystem is made up of two like components. However,
the two subsystems may have different components. A component is an
element of a subsystem and often has a constant failure rate. Without
loss of generality (see Appendix A) we shall assume constant component
failure rates throughout this study. Function migration means that
system operation with respect to a particular function moves from one
subsystem to the other subsystem according to the system reconfiguration
rules. The system has the reconfiguration scheme given graphically in
figure 2.3 and uses the following definitions.
For the hierarchical approach,
formed as in figure 2.5. 
Figure 2.5

Component-Level Subsystem Model (i = 1, 2)


From this component-level subsystem model, subsystem event rates are
derived (this process is described in section 2.4). There are generally
two types of subsystem event rates to be considered. These are: the
rates at which a subsystem reaches various degraded states and the rate
at which a subsystem becomes failed. Using such subsystem event rates,
a hierarchical model of the system is formed (figure 2.6).
Comparing figures 2.4 and 2.6 we see that the hierarchical model
aggregates component-level system model states.
Now that the essential elements have been introduced, a complete
characterization of a hierarchical Markov state is given.

2.2.2 Definition of a Hierarchical Markov State

The essence of the hierarchical approach lies in the definition of a
hierarchical Markov state. A hierarchical state captures the system
status in terms of subsystem-level events rather than component-level
events (as in the standard component-level Markov model approach,
Ref {4}). Normally, a Markov model generates the probability of having
experienced a sequence of component failures. However, in the proposed
approach, the hierarchical model generates the probability of having
experienced a sequence of events which reflect the status of the
system's subsystems. This is because a hierarchical model state is an
aggregation of those component-level system model states which have the
same system status with respect to the pertinent subsystem-level
events. Consequently, a hierarchical state probability is
(approximately) the summation of the corresponding component-level
system model state probabilities.
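
The aggregation just described can be illustrated with a short Python sketch. The state labels, the probabilities, and the mapping `state_map` below are hypothetical placeholders chosen only to show the summation; they are not the states of the sample architecture.

```python
def aggregate(component_probs, state_map):
    """Sum component-level state probabilities into hierarchical states."""
    hierarchical = {}
    for state, prob in component_probs.items():
        h_state = state_map[state]
        hierarchical[h_state] = hierarchical.get(h_state, 0.0) + prob
    return hierarchical


# Hypothetical component-level states, labeled by
# (failures in subsystem 1, failures in subsystem 2).
component_probs = {(0, 0): 0.90, (1, 0): 0.05, (0, 1): 0.03, (1, 1): 0.02}

# Hypothetical mapping: the subsystem-level event of interest is assumed
# to be "subsystem 1 has had a failure", so only the first index matters.
state_map = {
    (0, 0): "no_event",
    (0, 1): "no_event",
    (1, 0): "subsystem_1_event",
    (1, 1): "subsystem_1_event",
}

hierarchical_probs = aggregate(component_probs, state_map)
print(hierarchical_probs)
```

In the hierarchical approach itself the hierarchical state probabilities come from solving the hierarchical model, so they only approximate the sums that this sketch computes from a component-level solution.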

In order to derive the exit transition rates for the hierarchical
states, additional information must be associated with each hierarchical
state. In particular, for each subsystem-level event corresponding to
an exit transition from a given hierarchical state we must formulate an
event model which reflects both the time of entry into that particular
hierarchical state and the possible states that the subsystem can be in
when that hierarchical state is entered. The solution of the event model
provides the time evolution of a conditional subsystem state probability
vector (conditioned on being in a particular hierarchical state) which
in turn provides the information necessary to derive the corresponding
exit transition rate. The event models are readily derived from the
component-level subsystem models as will be shown in Section 2.4. The
derivation of the transition rates among hierarchical states will also
be prescribed in Section 2.4. Presently we will describe the sample
architecture used to illustrate these steps in the formulation of a
hierarchical model. This example is also utilized to provide the
numerical results of Chapter 4.


2.3 Sample Architecture


The sample architecture is a distributed processing system that
consists of a collection of processing subsystems that are completely
cross-strapped such that information can be exchanged between any
processors. In formulating the hierarchical model in section 2.4, we will
find that only two subsystems are needed in the sample architecture
(figure 2.7) to fully describe the approach, its implementation, and
associated problems. The ground rules for this system's operation
prescribe that a quad fault-tolerant processor (FTP) (Ref {14}) is said to
be operational if it has suffered no more than three channel (component)
failures, and is said to be degraded (i.e., loses masking capability) if
it has had two or more channel failures. A particular system function
will remain in the initial supporting processor until that processor
becomes degraded. The system will then migrate the function to a
subsystem that is not yet degraded (if one is available); otherwise the
function will remain in that subsystem until a third channel failure
results in a system loss (see figure 2.8). In this example, the
function of interest starts in FTP1 and migrates to the other subsystem as
indicated in the hierarchical model in figure 2.9. Note that in this
analysis, intercomputer network failures are neglected for the purpose
of clarity. Note also that failure detection and isolation (FDI) of
channel failures is assumed to be local to the subsystems. This is a
necessary condition for the application of the hierarchical approach.
However, coverage at the system level (i.e., probability of success of
function migration), although not included in this example, can be
readily incorporated in the state transitions of the hierarchical model.
The derivation of the transition rates among the hierarchical states (see
figure 2.9) is given in section 2.4.
For this sample architecture two different event models are required
for each subsystem. This is because for each subsystem there are two
distinct subsystem-level events intrinsic to the system definition. As
previously stated, separate event models are needed to produce the
conditional subsystem state probability vectors associated with the
calculation of hierarchical state transitions. These conditional subsystem
state probability vectors obey the differential equation:


$$\dot{\pi}(t) = A\,\pi(t) + I_n\,u(t)$$


where $\pi(t)$ is a conditional subsystem state probability vector
associated with the hierarchical state for which exit transitions are to be
derived. The $u(t)$ term is decomposed as follows:





where $p_i(t)$ is the conditional probability (conditioned on the
reconfiguration rules) that the functioning subsystem is in state i at
time t. $p(t)$ is a conditional subsystem state probability vector for the
same subsystem as $\pi(t)$ but reflects the possible states the subsystem
can be in upon entry to the hierarchical state for which exit transition
rates are to be derived. At this time it is necessary to explain the
pdf(t) and p(t) terms in detail.

2.4.1 Probability Density Function Determination

The pdf(t) term represents the probability density function of the
entry time to a particular hierarchical state. An isolated example is
used to derive the pdf(t) term in equation 2.3 and then the result is
generalized to the hierarchical model given in figure 2.9. Given the
simple four-state hierarchical model in figure 2.10 (analogous but
unrelated to the hierarchical system model given in figure 2.9), we need to
know the probability density function of the time of transition to a
particular state. Let the time of entrance to state i be the random
variable $\tau_i$. Given that the function of interest starts in state 0
with probability one, $\tau_0 = 0$ with certainty, i.e.,







FIGURE 2.10
ISOLATED EXAMPLE


(a Dirac delta function at t = 0). Using the definitions of expected
value and mean time to failure (MTTF) (Ref {11}), the pdf($\tau_1$) is
derived. Defining the state 0 to state 1 transition as a subsystem
failure, we have



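$$\mathrm{MTTF} = \int_0^\infty \tau_1\, m(\tau_1)\, d\tau_1 \tag{2.5}$$

where $m(\tau_1)$ is the mortality function:

$$m(\tau_1) = \lambda(\tau_1)\, R(\tau_1) \tag{2.6}$$

and $R(\tau_1)$ is the reliability function:

$$R(\tau_1) = e^{-\int_0^{\tau_1} \lambda(t)\,dt} = \pi_0(\tau_1) \tag{2.7}$$

(for this example). Thus,

$$m(\tau_1) = \lambda(\tau_1)\,\pi_0(\tau_1) \tag{2.8}$$

and,

$$\mathrm{MTTF} = \int_0^\infty \tau_1\,\lambda(\tau_1)\,\pi_0(\tau_1)\,d\tau_1 \tag{2.9}$$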
And using the expected value result for a random variable X:

$$E\{X\} = \int_{-\infty}^{\infty} X\,\mathrm{pdf}(X)\,dX \tag{2.10}$$

it can be seen by comparing equation 2.9 and equation 2.10 that:

$$\mathrm{pdf}(\tau_1) = \lambda(\tau_1)\,\pi_0(\tau_1) \tag{2.11}$$

The pdf term is found in the same way for higher subsystem event
levels, although it is not obvious that this is the case. Beyond the first
subsystem event level the distribution of times to enter a hierarchical
state would seem to become more involved computationally. Referring
back to our isolated example in figure 2.10, the time of transition to
state i is the random variable $\tau_i$. Let the holding time in state i
be a random variable $h_i$ (which depends on $\tau_i$). Thus, to find the
probability density function of the time to enter state n ($\tau_n$), an
(n-2)-fold convolution is necessary. This is because the time to
transition to state n contains the holding times in all the states
upstream of state n (which are all independent random variables).
Because the holding times in the hierarchical states involve a cascade
of entry time dependencies, an exact determination of the density
function of the time of transition to a hierarchical state at these higher
event levels is computationally difficult. This difficulty increases
rapidly with the event level of the hierarchical state in question.

The Markov property demands that the time spent in any state be
"memoryless"; thus a heavy constraint is imposed upon the distribution
of times that a process can remain in a given state. To satisfy the
Markov property, the holding time in a state of a continuous-time
process must be exponentially distributed with a parameter that depends
only upon the state in question. That is, the holding time distribution is
the solution to a first-order differential equation (Ref {7}). We have
noted (see Section 1.3) that an exact formulation of the hierarchical
model violates this restriction and is computationally difficult. Thus
we must now introduce an approximation.
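
As a brief worked illustration of the memoryless requirement (the symbols $T$ for the holding time, $\mu$ for its constant rate, and $s$, $t$ for elapsed times are introduced here only for this aside), an exponentially distributed holding time satisfies

$$P\{T > t\} = e^{-\mu t}, \qquad \frac{d}{dt}\,P\{T > t\} = -\mu\,P\{T > t\},$$

$$P\{T > s + t \mid T > s\} = \frac{e^{-\mu (s+t)}}{e^{-\mu s}} = e^{-\mu t} = P\{T > t\}.$$

The remaining holding time therefore does not depend on the time already spent in the state. An Erlang distribution of order two or more does not have this property, which is why the exact hierarchical model falls outside the Markov framework.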

If a hierarchical model is a semi-Markov process, the holding time for
a transition from some state i to any other state j may have an
arbitrary distribution. We are imposing a restriction to force the
hierarchical model to obey the Markov property. That is, in the
hierarchical model we will assume that the holding time distribution for a
transition from each state i to any other state j is exponential,
satisfies the Markov property, and is the same for any exiting transition.
We hypothesize that this is a good approximation if all transitions out
of each hierarchical state have approximately the same holding time
distributions.



Due to the above approximation, the pdf term for an event model
(equation 2.11) at any event level can easily be included in a
difference equation which determines a subsystem's conditional state
probability vector and does not contain a convolution to reflect the
cascade of local time dependencies. To start, we write the event models
associated with a hierarchical state as differential equations of the form:


$$\dot{\pi}(t) = A\,\pi(t) + \mathrm{pdf}(t)\,p(t) \tag{2.12}$$

where A is the system matrix for the model shown in figure 2.11 and p is
as given in equation 2.2. The pdf term is broken into its parts from
equation 2.11:

$$\mathrm{pdf}(t) = \sum_i \lambda_i(t)\,H_i(t) \tag{2.13}$$

where $H_i(t)$ is the probability of being in a hierarchical state with an
exit transition into the hierarchical state associated with equation
2.12 and $\lambda_i(t)$ is the corresponding transition rate. Substituting
equation 2.13 into equation 2.12, the exact solution to equation 2.12
is:

$$\pi(t) = e^{At}\,\pi(0) + \int_0^t e^{A(t-\tau)} \sum_i \lambda_i(\tau)\,H_i(\tau)\,p(\tau)\,d\tau \tag{2.14}$$

which, when converted to discrete time, is:

$$\pi(t_k) = \Phi(t_k, t_0)\,\pi(t_0) + \sum_{j=0}^{k-1} \Phi(t_k, t_j)\,\mathrm{pdf}(t_j)\,p(t_j)\,\Delta t \tag{2.15}$$

where:

$$\Phi(t_k, t_0) = e^{A(t_k - t_0)} \tag{2.16}$$


Equation 2.15 is a convolution sum. However, it can be solved as a dif- 
ference equation, stepping forward in time: 




As mentioned earlier, the p term in the context of our example
represents the conditional state probability vector for a subsystem when
the function migrates there. The $\mathrm{pdf}(t_k)\,\Delta t$ term
represents the probability that the function migrates to that subsystem at
time $t_k$. Both of these quantities are expressed in terms of the same
global time argument. Because both of these terms (pdf and p) are
formulated in terms of the same time argument and are available at each
time step, equation 2.17 can be updated at every time step. The
hierarchical transition rates for the hierarchical model can then be
constructed from $\pi(t_k)$ as prescribed in Section 2.4.4.
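
A minimal Python sketch of this forward stepping is given here. It assumes the update has the form of a single-step state transition matrix applied to the previous vector plus the entry term pdf(t) p(t) Δt; the function names and all numerical values are hypothetical and serve only to show the mechanics.

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(row[c] * vector[c] for c in range(len(vector))) for row in matrix]


def step_event_model(pi_prev, phi_ss, pdf_prev, p_prev, dt):
    """Advance the conditional state probability vector by one time step.

    Assumed update: pi_k = Phi_ss * pi_{k-1} + pdf_{k-1} * p_{k-1} * dt
    """
    propagated = mat_vec(phi_ss, pi_prev)
    return [x + pdf_prev * p * dt for x, p in zip(propagated, p_prev)]


# Hypothetical numbers for a four-state subsystem model.
identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
pi_0 = [0.0, 0.0, 0.0, 0.0]  # no probability mass before any entry
p_0 = [0.8, 0.2, 0.0, 0.0]   # assumed entry-state distribution
pi_1 = step_event_model(pi_0, identity, pdf_prev=0.01, p_prev=p_0, dt=1.0)
print(pi_1)
```

Because only the previous vector and the current pdf and p values are needed, no history of earlier steps has to be stored, which is what removes the explicit convolution.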

2.4.2 Conditional Probability Derivation 

The term p(t) in u(t) is a conditional subsystem state probability
vector which reflects both the entry time to the hierarchical state
associated with $\pi(t)$ of equation 2.11 and the possible states that the
subsystem can be in given the transition event and the associated source
state. For the sample architecture we know that when a function
migrates to a new processor, the post-migration processor is either in
the zero channel failure state or the one channel failure state. And,
given that all subsystems are active at system startup, there will be a
unique conditional state probability vector (conditioned on having never
entered states three or four) associated with the time of entry
into a hierarchical state.

The Markov model for an FTP subsystem is the same as that given in
figure 2.11 and eq. 2.17. Thus, this model must be evaluated to
produce the conditional subsystem state probability vector. The
behavior of the model in figure 2.11 obeys the equation:

(2.18)


The conditional subsystem state probability vector p(t) is computed by
evaluating this subsystem state probability vector and normalizing it at
each time step so that the subsystem can only be in states one and two.
This is done as follows:

$$p_1(t) = \frac{\pi_1(t)}{\pi_1(t) + \pi_2(t)} \tag{2.19}$$

and,

$$p_2(t) = \frac{\pi_2(t)}{\pi_1(t) + \pi_2(t)} \tag{2.20}$$

for every time step. This gives the desired p(t) for equation 2.12 and
does so in a way that requires little computational overhead.
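
The per-step normalization can be sketched as follows in Python. The function name and the sample probability vector are hypothetical; list positions 0 and 1 are assumed to correspond to states one and two.

```python
def conditional_entry_vector(pi):
    """Renormalize a subsystem state probability vector over states one and two.

    pi[0] and pi[1] hold the probabilities of states one and two; all
    other entries of the returned vector are set to zero.
    """
    mass = pi[0] + pi[1]
    if mass <= 0.0:
        raise ValueError("no probability mass in states one and two")
    return [pi[0] / mass, pi[1] / mass] + [0.0] * (len(pi) - 2)


# Hypothetical four-state subsystem probabilities at some time step.
pi_t = [0.90, 0.08, 0.015, 0.005]
p_t = conditional_entry_vector(pi_t)
print(p_t)
```

Each call costs one addition and two divisions, which is consistent with the remark that the normalization adds little computational overhead.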


2.4.3 Initial Conditions 


The initial conditions for each of the models associated with a
hierarchical state are determined in a straightforward manner. Since
system operation begins with no failures, the Markov subsystem models
associated with hierarchical state one have the initial condition:



(2.21)


For subsystem models associated with the other hierarchical states, the
initial conditions are:



(2.22)
2.4.4 Calculation of Subsystem Event Rates

The subsystem models are formulated as Markov chains in which states
are reached via transitions which reflect component-level events. A
subsystem event rate must be computed for each of the events pertinent
to a particular subsystem. To compute a subsystem event rate, a full
Markov model of a subsystem (figure 2.11) is reduced to a simple
two-state Markov model (figure 2.12) containing an operational state, a
failed or degraded state (depending on the event in question), and a
time-varying transition rate.



Figure 2.12 

Transition Rate Example 
Reduced Model 




For the example in figure 2.11, state 4 is the failure state and states
1-3 are the operational states. The transition rate in figure 2.12 is
computed as follows:

$$\lambda(t) = \sum_{i \in \{\text{operational states}\}} \lambda_{i4}\, P\{\text{in state } i \text{ at } t \mid \text{system occupies an operational state at } t\} \tag{2.23}$$

So, for this example,

$$\lambda(t) = \frac{\lambda_{14}\,\pi_1(t) + \lambda_{24}\,\pi_2(t) + \lambda_{34}\,\pi_3(t)}{\pi_1(t) + \pi_2(t) + \pi_3(t)} \tag{2.24}$$

However, $\lambda_{14}$ and $\lambda_{24}$ represent the rates of
simultaneous failure events, which are negligible. Thus, for the given
example,

$$\lambda(t) = \frac{\lambda_{34}\,\pi_3(t)}{1 - \pi_4(t)} \tag{2.25}$$

is the rate of the event of subsystem failure, which corresponds to
$\lambda_1'(t)$ and $\lambda_2'(t)$ in Figure 2.6. The subsystem
degradation rates $\lambda_1(t)$ and $\lambda_2(t)$ of Figure 2.6 are
obtained in a similar manner.
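
The event rate calculation can be sketched in Python as a probability-weighted average of the exit rates over the operational states. The function name, the state probabilities, and the rate values below are hypothetical; list positions 0 to 3 are assumed to correspond to states 1 to 4.

```python
def event_rate(pi, exit_rates, operational):
    """Rate of leaving the operational set, given the subsystem is in it.

    pi          -- state probability vector at one time step
    exit_rates  -- exit_rates[i] is the rate from state i into the event state
    operational -- indices of the states counted as operational
    """
    in_operational = sum(pi[i] for i in operational)
    if in_operational <= 0.0:
        raise ValueError("subsystem has no probability of being operational")
    weighted = sum(exit_rates[i] * pi[i] for i in operational)
    return weighted / in_operational


# Hypothetical values: only the third state feeds the failure state,
# and simultaneous-failure rates from the first two states are neglected.
pi_t = [0.90, 0.08, 0.015, 0.005]
lam_34 = 2.0e-4  # assumed rate (per hour) from state 3 to state 4
rate = event_rate(pi_t, [0.0, 0.0, lam_34, 0.0], operational=[0, 1, 2])
print(rate)
```

With the simultaneous-failure rates set to zero, the result agrees with the reduced form in which the denominator is one minus the failure state probability.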
Another factor in determining the accuracy of a hierarchical model is
the ability of the derived event rate to accurately model the dynamics
of the component-level system model states that are aggregated into the
source state for the rate in question.



Unreliability predictions for the sample architecture are obtained in
two ways: a component-level Markov chain (exact model) and the
hierarchical procedure described in Chapter 2. The event models associated
with the hierarchical approach are also component-level Markov chains,
and because we have assumed constant component failure rates (see
Appendix A) these models and the exact model can be readily solved via a
discrete-time, time-invariant state transition matrix (STM) formulation.
In the case of the hierarchical model, time-varying transition rates
appear and consequently this model must be numerically integrated. Both
methods are described below.



Since the event and exact models are linear time-invariant Markov
chains, a discrete-time, time-invariant state transition matrix
formulation can be used to propagate the state probability vectors through
time. The general system to be solved is:
$$\dot{x}(t) = A\,x(t) + B\,u(t) \tag{3.1}$$


Because the homogeneous solution is a degenerate case of the complete
solution of the state equation (equation 3.1), that result will be shown
later. The solution of equation 3.1 is:

$$x(t) = e^{At}\,x(0) + \int_0^t e^{A(t-\tau)}\,B\,u(\tau)\,d\tau \tag{3.2}$$

The terms $e^{At}$ and $e^{A(t-\tau)}$ are the STM for the system in
equation 3.1. Putting equation 3.2 in terms of an STM $\Phi$:

$$x(t) = \Phi(t - t_0)\,x(t_0) + \int_{t_0}^{t} \Phi(t - \tau)\,B\,u(\tau)\,d\tau, \qquad t > t_0 \tag{3.3}$$


To obtain a discrete-time representation of equation 3.3 a constant time
step, $\Delta t$, is needed. Defining

$$\Delta t = t_k - t_{k-1}, \qquad k = 1, 2, \ldots \tag{3.4}$$

and,

$$t_k = k\,\Delta t \tag{3.5}$$

with,

$$t_0 = (k - 1)\,\Delta t \tag{3.6}$$


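Assuming u(t) is constant over each time step, equation 3.2 becomes:

$$x(t_k) = e^{A \Delta t}\,x(t_{k-1}) + \left( \int_0^{\Delta t} e^{A\tau}\,d\tau \right) B\,u(t_{k-1}) \tag{3.7}$$

Using the single step STM $\Phi_{ss} = e^{A \Delta t}$, equation 3.7
becomes:

$$x(t_k) = \Phi_{ss}(\Delta t)\,x(t_{k-1}) + \left( \int_0^{\Delta t} e^{A\tau}\,d\tau \right) B\,u(t_{k-1}) \tag{3.8}$$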
Given that B is a constant matrix,

$$x(t_k) = \Phi_{ss}(\Delta t)\,x(t_{k-1}) + B\,u(t_{k-1})\,\Delta t \tag{3.9}$$

The single step STM must be put in terms that can easily be computed.
We know that:

$$\Phi_{ss}(\Delta t) = e^{A \Delta t} \tag{3.10}$$

$$e^{A \Delta t} = I + A\,\Delta t + \frac{(A\,\Delta t)^2}{2!} + \frac{(A\,\Delta t)^3}{3!} + \cdots \tag{3.11}$$

For $\Delta t$ much smaller than the reciprocal of the largest magnitude
eigenvalue of A, the expansion can be truncated after a few terms and still
retain the desired accuracy (Ref {2}). So for this implementation, the
single step STM is approximated as:

$$\Phi_{ss}(\Delta t) = I + A\,\Delta t \tag{3.12}$$
For the purely homogeneous cases (i.e., the exact model and event
models associated with the first hierarchical state), u(t) = 0 and the
solution of the state equation becomes:

$$x(t_k) = \Phi_{ss}(\Delta t)\,x(t_{k-1}) \tag{3.13}$$

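For the subsystem models of FTP1 and FTP2 in the sample architecture,

    \Phi_{ss}(\Delta t) =
    \begin{bmatrix}
      1 & 0 & 0 & 0 \\
      0 & 1 & 0 & 0 \\
      0 & 0 & 1 & 0 \\
      0 & 0 & 0 & 1
    \end{bmatrix}
    +
    \begin{bmatrix}
      -4\lambda & 0         & 0         & 0 \\
      4\lambda  & -3\lambda & 0         & 0 \\
      0         & 3\lambda  & -2\lambda & 0 \\
      0         & 0         & 2\lambda  & 0
    \end{bmatrix} \Delta t                                        (3.14)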


For the exact model, the single step STM is (see Figure 3.1):

    \Phi_{ss}(\Delta t) = I + A_E \Delta t                        (3.15a)

    A_E: see the Component-Level Exact Model Matrix               (3.15b)


The homogeneous models are solved simply by stepping forward in time
using equation 3.13.
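
As an illustration of equation 3.13, the following Python sketch steps the
four-state subsystem model of equation 3.14 forward in time with the
first-order single step STM of equation 3.12. The failure rate `lam`, the
time step `dt`, the number of steps and all helper names are assumptions
chosen for the example; they are not values or code from this study.

```python
import math

def single_step_stm(A, dt):
    """First-order single step STM, Phi_ss = I + A*dt (equation 3.12)."""
    n = len(A)
    return [[(1.0 if i == j else 0.0) + A[i][j] * dt for j in range(n)]
            for i in range(n)]

def step(phi, x):
    """One application of x(t_k) = Phi_ss x(t_{k-1}) (equation 3.13)."""
    return [sum(p * xj for p, xj in zip(row, x)) for row in phi]

def propagate(phi, x0, n_steps):
    """Step the state probability vector forward n_steps time steps."""
    x = list(x0)
    for _ in range(n_steps):
        x = step(phi, x)
    return x

# Assumed example values (not from the thesis).
lam = 1.0e-4      # channel failure rate, per hour
dt = 0.1          # time step in hours; lam*dt << 1 keeps the truncation accurate
n_steps = 1000    # 100 hours in total

# Subsystem system matrix from equation 3.14 (each column sums to zero).
A = [[-4 * lam, 0.0, 0.0, 0.0],
     [4 * lam, -3 * lam, 0.0, 0.0],
     [0.0, 3 * lam, -2 * lam, 0.0],
     [0.0, 0.0, 2 * lam, 0.0]]

phi = single_step_stm(A, dt)
x_final = propagate(phi, [1.0, 0.0, 0.0, 0.0], n_steps)

print("state probabilities:", x_final)
print("sum of probabilities:", sum(x_final))
print("closed-form P(state 1):", math.exp(-4 * lam * dt * n_steps))
```

Because every column of A sums to zero, every column of the single step
STM sums to one, so the total probability is preserved at each step. The
first state has the closed-form solution $e^{-4 \lambda t}$, which gives a
quick check on the truncation error for a chosen time step.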


For determining the conditional subsystem state probability vectors 
associated with the hierarchical states at the first and greater event 
levels, the B matrix is the identity matrix and $u(t_k)$ is as given in
equation 2.3. So again the model is solved simply by stepping forward 
in time using equation 3.9. 


3.1.2 Solution of Hierarchical Model

As formulated, the hierarchical model is simply a first order,
homogeneous, linear system of differential equations with time-varying
coefficients. The general form of this system is:


    \dot{x}(t) = A(t) x(t)                                        (3.16)

This is a simple initial value problem that must be integrated to
produce the result.

(3.15b)

Component-Level Exact Model Matrix

Figure 3.1

Component-Level Model
Two Subsystems

There are generally two classes of methods for solving an initial value
problem: single-step methods (Euler method, Runge-Kutta methods) and
multistep methods (predictor-corrector methods) (Ref {12}). The choice
consists of a trade-off between accuracy and computational efficiency.
The method used in this study is a fifth- and sixth-order Runge-Kutta
method. It was chosen because only a moderate relative error is
required: the input data, and consequently the solution of the truth
model, carry few significant digits (1 or 2 at best). An outline of the
Runge-Kutta method follows:

1. Take a step $\Delta t$ forward from t using the Euler method.

2. Evaluate $\dot{x}$ at this point and use $\dot{x}(t + \Delta t)$ to
   adjust the derivative to be used at t.

3. Use the adjusted slope to take a second step from t.

4. Evaluate $\dot{x}$ at this point and further adjust the slope
   to be used at t.

5. Repeat 3 and 4 to the order desired.

6. Combine all estimates to take the actual step to $t + \Delta t$.

It is pointless to rederive the Runge-Kutta formulas here. The reader
is directed to Rice (Ref {12}) for the details of the formulas.

The actual subroutine used here is DVERK from the IMSL package. It
was chosen because it solved the hierarchical models to close agreement
with the exact models. If the relative error requirement is loosened, a
less computationally expensive method (such as the single-step,
single-stage Euler method) could be used, introducing even greater
computational savings through the use of the hierarchical approach.
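
The Runge-Kutta outline above can be made concrete with the classical
fourth-order formulas. This is a minimal Python sketch with a fixed step;
it is not the DVERK routine used in this study. The two-state model and
its time-varying rate in `rate_matrix` are hypothetical and serve only to
give the step a system of the form of equation 3.16.

```python
import math

def rk4_step(a_of_t, x, t, dt):
    """One classical fourth-order Runge-Kutta step for x' = A(t) x."""
    def f(time, state):
        # Right-hand side A(time) * state as a plain matrix-vector product.
        A = a_of_t(time)
        return [sum(a * s for a, s in zip(row, state)) for row in A]

    def shifted(state, slope, h):
        # state + h * slope, element by element.
        return [s + h * k for s, k in zip(state, slope)]

    k1 = f(t, x)
    k2 = f(t + dt / 2, shifted(x, k1, dt / 2))
    k3 = f(t + dt / 2, shifted(x, k2, dt / 2))
    k4 = f(t + dt, shifted(x, k3, dt))
    # Weighted combination of the four slope estimates.
    return [xi + dt / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def rate_matrix(t):
    """Hypothetical two-state model with exit rate r(t) = 0.1 * t."""
    r = 0.1 * t
    return [[-r, 0.0],
            [r, 0.0]]

x, t, dt = [1.0, 0.0], 0.0, 0.01
for _ in range(1000):
    x = rk4_step(rate_matrix, x, t, dt)
    t += dt

# For this test model the first state has the closed form exp(-0.05 t^2).
print("numerical:", x[0], "closed form:", math.exp(-0.05 * t * t))
```

Replacing `rk4_step` with a single Euler update `x + dt * A(t) x` gives the
cheaper, less accurate alternative mentioned above.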

3.1.3 Program Outline

The hierarchical model and event models are solved simultaneously, as
the event models provide information needed to construct the
time-varying transition rates for the hierarchical model. The exact
model is also solved simultaneously so that the hierarchical results can
easily be compared to the exact model. An outline of the simulation
program follows:

• Read initial data

• Initialize subsystem and hierarchical state probability vectors

• Initialize exact model state probability vector

• Assemble event model STMs

• Assemble exact model STM

• For t = 0 to $t_{mission}$ by $\Delta t$

    • Update exact model

    • Compute hierarchical transition rates from event model results

    • Assemble hierarchical system matrix

    • Integrate hierarchical system state equations

    • Output time, hierarchical unreliability prediction, exact model
      unreliability prediction

    • Update conditional subsystem state probability vectors

• Output run statistics


CHAPTER 4. RESULTS


4.1 Computational Efficiency

Probably the most important result of the hierarchical modeling tech- 
nique is the increased computational efficiency compared to the compo- 
nent-level modeling approach. When a large number of subsystems exist 
in a system and many combinations of reconfigurations are of interest, 
the state-space of a component-level model grows rapidly with the number
and complexity of subsystems. In contrast, the hierarchical approach 
does not add nearly as many states to the system-level model when anoth- 
er subsystem is added. Consequently there is potentially a considerable 
difference in the computational burden associated with each approach. 
To demonstrate this observation we will examine the hierarchical and 
exact models for three systems comprised of different subsystems. The 
three cases examined are: two dissimilar subsystems, two dissimilar sub- 
systems with imperfect coverage at the subsystem level, and three dis- 
similar subsystems. 

We shall count the number of multiplications needed per time step.
The assumption is made that, for the purpose of comparing computational
efficiency, the hierarchical model and the exact model are solved using
the same numerical technique. Note that in formulating the exact
models, considerable state reduction was performed via state aggregation
based upon common exit transition rates. In addition we exploited the
observation that once an FTP loses two channels, additional channel
failures need not be tracked since that FTP is no longer a candidate
post-migration site; the only exception is the case of the final
supporting FTP, where channel failures must be tracked to the point of
total system failure. Also note that the operation of multiplying an
N-vector by an N×N matrix is of order $N^2$.

For the case of two dissimilar subsystems each subsystem model is 
fourth order (see Figure 2.11) and the hierarchical model is fourth 
order. Counting multiplications per time step we have: 


model                 multiplications/$\Delta t$

0th level event       $2 \times 3^2$
1st level event       $2 \times (4^2 + 2)$
hierarchical          $4^2$

sum                   70


Table 4.1

Notice that the two event models associated with the 0th event level
have only three states. This follows from the observation that we are
only interested in the transition between states 2 and 3 in Figure 2.11,
as this transition corresponds to the event that triggers the function
migration. This observation will be utilized in the other two cases.
The component-level exact model for this case has the following order of
execution:


model                 multiplications/$\Delta t$

exact                 $11^2$

sum                   121


Table 4.2


The next two cases show, when compared to the first, that a
hierarchical model's computational advantage over a component-level
model of the system increases as the subsystem models become more
complex and also as more subsystems are added to the system. To address
the issue of more complex subsystem models, we will no longer assume
perfect coverage at the subsystem level. Instead we will include a
vulnerable state in the subsystem model to capture near-coincident
failures in a subsystem which will lead to the loss of that subsystem.
Such near-coincident failures arise when an adjudication is to be made
on the occurrence of a failure among three components. If two
components fail coincidentally, the ability to correctly isolate the
failed components may be lost. Consequently the vulnerable state is
included between states 2 and 3 of Figure 2.11. This state has two exit
transitions, one of which reflects the automatic failure detection,
isolation and reconfiguration (FDIR) rate and the other the occurrence
of a second channel failure, which yields a subsystem failure. The
addition of the vulnerable state makes the subsystem models fifth order.
Counting multiplications per time step in the hierarchical model we
have:


model                 multiplications/$\Delta t$

0th level event       $2 \times 4^2$
1st level event       $2 \times (5^2 + 2)$
hierarchical          $4^2$

sum                   102


Table 4.3

Note that the 0th level event model is reduced in order as before. The
component-level exact model for this system has the following order of
execution:


model                 multiplications/$\Delta t$

exact                 $18^2$

sum                   324


Table 4.4


Our third case addresses the inclusion of additional subsystems in the
system. The system now includes three dissimilar subsystems and once
again we assume perfect coverage in the subsystems. The event models
are again fourth order, but the hierarchical model is now eighth order.
Counting multiplications per time step we have:

model                 multiplications/$\Delta t$

0th level event       $3 \times 3^2$
1st level event       $6 \times (4^2 + 2)$
2nd level event       $3 \times (4^2 + 2)$
hierarchical          $8^2$

sum                   253


Table 4.5


The component-level exact model for this case has 30 states and the
following order of execution:


model                 multiplications/$\Delta t$

exact                 $30^2$

sum                   900


Table 4.6
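
The counts in Tables 4.1 through 4.6 can be checked with a few lines of
arithmetic. The sketch below is in Python; the dictionary layout and the
names are assumptions made for the example. The exact model orders of 11
and 18 for the first two cases, and the six 1st level event models in the
third case, are inferred from the tabulated sums (121, 324 and 253); only
the 30-state order is stated directly in the text.

```python
# Each hierarchical entry is (number of models, multiplications per model).
cases = {
    "two dissimilar subsystems": {
        "hierarchical": [(2, 3**2), (2, 4**2 + 2), (1, 4**2)],
        "exact_order": 11,   # inferred from the sum of 121
    },
    "two subsystems, imperfect coverage": {
        "hierarchical": [(2, 4**2), (2, 5**2 + 2), (1, 4**2)],
        "exact_order": 18,   # inferred from the sum of 324
    },
    "three dissimilar subsystems": {
        "hierarchical": [(3, 3**2), (6, 4**2 + 2), (3, 4**2 + 2), (1, 8**2)],
        "exact_order": 30,   # stated in the text
    },
}

def hierarchical_count(terms):
    """Total multiplications per time step for the hierarchical approach."""
    return sum(n_models * per_model for n_models, per_model in terms)

def exact_count(order):
    """An N-vector times an N x N matrix costs N^2 multiplications."""
    return order ** 2

results = {}
for name, case in cases.items():
    h = hierarchical_count(case["hierarchical"])
    e = exact_count(case["exact_order"])
    results[name] = (h, e)
    print(f"{name}: hierarchical = {h}, exact = {e}, ratio = {e / h:.2f}")
```

The ratio of exact to hierarchical multiplications grows from roughly 1.7
in the first case to roughly 3.2 and 3.6 in the second and third.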


The above results demonstrate the savings in computation when the
hierarchical approach is utilized: in the three cases it requires 70,
102 and 253 multiplications per time step, against 121, 324 and 900 for
the exact model. The worst case for the hierarchical approach from a
computational viewpoint occurs when all the subsystems are different.
Then, there must be a distinct event model for each state transition of
the hierarchical model. Even in this case, however, the hierarchical
approach requires less computation than a component-level model for
non-trivial systems. For systems wherein some (or all) subsystems are
identical, the number of event models required is less than the number
of state transitions appearing in a full (i.e. no state reduction
performed) hierarchical model. The number of event models required is
equal to the number of state transitions remaining after the order of
the full hierarchical model has been reduced by performing state
aggregation on the basis of common exit transition rates.



As mentioned previously, the two-subsystem architecture fully
describes the hierarchical approach and associated problems. Thus, all
test cases utilized the models developed for the sample architecture
described in Chapter 2. Two sets of cases are presented: similar
subsystems, and dissimilar subsystems.



The parameter to be varied in this section is the ratio of component
MTBF (mean time between failures) to mission time. All plots are
comparisons of the unreliability predictions of the hierarchical and of
the component-level approaches for the given system versus time.
Figure 4.1 shows the results for an MTBF to mission time ratio of one.
Figure 4.2 gives the results for an MTBF to mission time ratio of five.
Figure 4.3 is the result for an MTBF to mission time ratio of ten. We
see that all plots show good agreement between the two approaches.
Figure 4.4 shows the absolute value of the percentage of relative error
between the hierarchical model's and exact model's unreliability
predictions given in figure 4.3. We see that there is indeed an
approximation introduced by utilizing the hierarchical model, as the
percentage of relative error is larger than the error expected from
machine precision. Note that the large relative error early in the
mission is a result of the reduced observability for the first few time
steps in the hierarchical model. The error plots for the previous two
cases demonstrate similar behavior and therefore are not included.
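
The error measure plotted in Figure 4.4 can be written out explicitly.
The sketch below is in Python; the function name, the handling of zero
values and the sample unreliability values are all hypothetical and are
not results from this study.

```python
def relative_error_percent(hierarchical, exact):
    """Absolute value of the percentage relative error, point by point.

    Points where the exact unreliability is zero (for example at t = 0)
    are reported as 0.0 if the hierarchical value is also zero, and as
    infinity otherwise. This convention is an assumption of the sketch.
    """
    errors = []
    for h, e in zip(hierarchical, exact):
        if e == 0.0:
            errors.append(0.0 if h == 0.0 else float("inf"))
        else:
            errors.append(abs(h - e) / abs(e) * 100.0)
    return errors

# Hypothetical unreliability histories, for illustration only.
exact_unrel = [0.0, 2.0e-8, 4.0e-6, 1.0e-4]
hier_unrel = [0.0, 2.5e-8, 4.1e-6, 1.001e-4]

errors = relative_error_percent(hier_unrel, exact_unrel)
for e in errors:
    print(f"{e:.3f} %")
```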


Figure 4.1

Identical Subsystems
component MTBF:mission time = 1.0

Figure 4.2

Identical Subsystems
component MTBF:mission time = 5.0

Figure 4.3

Identical Subsystems
component MTBF:mission time = 10.0

Figure 4.4

Relative Error Percentage
Identical Subsystems
component MTBF:mission time = 10.0


4.2.2 Dissimilar Subsystems

In this section, the component MTBF to mission time ratios are
different for the two subsystems. In the first case (figure 4.5)
subsystem 1 has an MTBF to mission time ratio of ten, while for
subsystem 2 this ratio is equal to one. In figure 4.6 subsystem 1 has
an MTBF to mission time ratio of ten while this ratio for subsystem 2 is
equal to 0.1. Again the plots show the predictions of system
unreliability by the hierarchical approach and by the component-level
exact model against time.

We see that there is disagreement between the curves in both figure 
4.5 and figure 4.6. Note that the amount of disagreement is larger for 
the smaller difference in component failure rate between the subsystems 
(figure 4.5). Also note that the hierarchical approach is neither con- 
sistently conservative nor consistently optimistic in its prediction of 
system unreliability. Figures 4.7 and 4.8 give the absolute value of 
the percentage of relative error between the hierarchical model's and 
exact model's unreliability predictions given in Figures 4.5 and 4.6 
respectively. Again we see that the approximation introduced using the 
hierarchical model is larger than can be attributed to computer round- 
off error. 

Figure 4.3 has the same mission time as figures 4.5 and 4.6 but has
identical component failure rates in the subsystems. In this case,
there is excellent agreement with the exact model. So, the trend we see
in figures 4.3, 4.5 and 4.6 is that as the difference in the component
failure rates between the subsystems is increased, the error in the
hierarchical approach increases, and then decreases, for the example
given.

In Chapter Two we said the validity of the hierarchical approach
depends on whether all transitions out of a hierarchical state to any
other hierarchical state are first-order and the same for any
transition. In light of this, the hierarchical method is expected to
give very small error for identical subsystems. If the subsystems are
identical, $x_2(t)$ equals $x_1(t)$ in figure 2.9. As the component
failure rates of the two subsystems diverge, our assumption breaks down.
However, when one subsystem's component failure rate is much larger than
the other subsystem's, the exit transitions from hierarchical state 1 in
figure 2.9 become essentially a single transition rate. Put another
way, a dominant failure mode surfaces and the other mode becomes
negligible in the computation of unreliability by both the hierarchical
approach and the component-level exact model. So we see that the
hierarchical approach is not always conservative, but the associated
error is small for the cases studied.

Figure 4.5

Different Subsystems

subsystem 1: component MTBF:mission time = 10.0
subsystem 2: component MTBF:mission time = 1.0


[Plot: SYSTEM UNRELIABILITY; legend: HIERARCHICAL, COMPONENT-LEVEL]

Figure 4.6

Different Subsystems

subsystem 1: component MTBF:mission time = 10.0
subsystem 2: component MTBF:mission time = 0.1

Figure 4.8


CHAPTER 5. SUMMARY AND CONCLUSIONS


5.1 Summary

The objective of this research has been to extend present reliability
analysis techniques to conveniently capture sequence dependencies within
the framework of a structural decomposition approach. The requirement
has been to do this in a way that is accurate while being both less
labor intensive and less computationally expensive than combinatorial
approaches and component-level Markov models. The motivation for this
research is a need for less difficult "first cut" analyses of
fault-tolerant systems comprised of fault-tolerant "building blocks."

The methodology is based on defining unique subsystem-level events
and computing associated event rates for a given system definition. The
event rates are determined from subsystem models and are embedded in a
hierarchical Markov model of the system. The method is derived as it is
applied to a specific system definition comprised of two fault-tolerant
subsystems.

In formulating a hierarchical model we have made the assumption that
the Markov property applies even though we recognize that a hierarchical
model is truly semi-Markov, as indicated by the unreliability
predictions for the case of dissimilar subsystems (see Figures 4.5 and
4.6). Encouragingly, however, the exercises performed indicate that for
the given system definition the unreliability prediction of the
hierarchical model closely approximates the unreliability prediction of
an exact component-level Markov model. The predictions for different
subsystems substantiate the claim of semi-Markov behavior. Thus the
results of this study show that the hierarchical approach is a viable
and useful method for fault-tolerant system reliability modeling. With
some additional investigation the applicability of the approach can
potentially be broadened. Some additional research topics are discussed
below.

5.2 Topics for Further Research

As stated previously, the issue of repair at the subsystem level 
should be examined in order to extend the usefulness of the hierarchical 
approach. 

The effects of the hierarchical model's semi-Markov behavior can be
studied further if three subsystems are considered in the system
definition. This would produce a hierarchical model which would have a
cascade of local time dependencies for the holding time distributions of
a hierarchical state.

When architectures comprised of different subsystems are addressed,
error is introduced due to the approximations inherent in the
hierarchical approach. Although the error in our exercises was
acceptably small, the approximations were not consistently conservative
or optimistic. Thus, two problems should be examined. First, given the
present hierarchical approach, a measure of the error which does not
require an exact model should be produced. Second, a methodology to
force the error to be either conservative or optimistic should be
investigated. The associated results would consequently bound the
system unreliability.

The hierarchical modelling technique used in this study should be
further examined to investigate the impact on state probability
predictions for subsequent performability analyses.

Conclusion 

Although the hierarchical approach has not been investigated
completely, the concept has proved to be viable and to have several
useful features with respect to the analysis of large complex systems
comprised of fault-tolerant building blocks. Through the use of
structural decomposition, the model formulation for such systems is
simplified. Furthermore, due to the high degree of state aggregation
intrinsic to the hierarchical approach, model solution is less
computationally burdensome. The computational savings increase, in
comparison to a component-level modelling approach, as the number and
complexity of the subsystems increase. Additional savings occur when
some or all of the subsystems are equivalent.



Throughout this thesis, the component failure rates have been assumed
to be constant. This is a large restriction and should be examined. It
is desirable to give the component failure rates an arbitrary
distribution. The Weibull distribution is commonly used (Ref {2}).

The Weibull distribution can model a constant failure rate or
monotonically increasing or decreasing failure rates. This probability
density function takes the form:

    f(t) = k m t^{m-1} \exp(-k t^m),   t \geq 0                   (A1.1)

    f(t) = 0,                          t < 0                      (A1.2)

where k and m are positive, non-zero real numbers. The failure rate 
corresponding to this distribution is: 

    \lambda(t) = k m t^{m-1}                                      (A1.3)

From equation A1.3 we see that if m = 1 the Weibull distribution
produces a constant failure rate.

A preliminary result indicates that the hierarchical approach models
system reliability adequately, and introduces little error with respect
to the truth model (see figure A1.1).
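
Equations A1.1 through A1.3 can be checked numerically. The Python sketch
below evaluates the Weibull density and failure rate, and uses the
standard relation $\lambda(t) = f(t)/R(t)$ with reliability
$R(t) = \exp(-k t^m)$, which is not written out in the text. The
parameter values and function names are assumptions chosen for the
example.

```python
import math

def weibull_pdf(t, k, m):
    """Probability density of equations A1.1 and A1.2."""
    if t < 0:
        return 0.0
    return k * m * t ** (m - 1) * math.exp(-k * t ** m)

def weibull_reliability(t, k, m):
    """R(t) = exp(-k t^m) for t >= 0 (standard result, assumed here)."""
    return math.exp(-k * t ** m)

def weibull_failure_rate(t, k, m):
    """Failure rate of equation A1.3."""
    return k * m * t ** (m - 1)

k = 1.0e-3                            # assumed value of k
times = [0.5, 10.0, 100.0, 1000.0]    # all t > 0, so m < 1 is safe to evaluate

constant = [weibull_failure_rate(t, k, 1.0) for t in times]    # m = 1
increasing = [weibull_failure_rate(t, k, 2.0) for t in times]  # m > 1
decreasing = [weibull_failure_rate(t, k, 0.5) for t in times]  # m < 1

print("m = 1.0:", constant)
print("m = 2.0:", increasing)
print("m = 0.5:", decreasing)

# Consistency check: failure rate equals density divided by reliability.
t_check, m_check = 10.0, 2.0
ratio = weibull_pdf(t_check, k, m_check) / weibull_reliability(t_check, k, m_check)
print("f/R:", ratio, "lambda:", weibull_failure_rate(t_check, k, m_check))
```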



[Plot: SYSTEM UNRELIABILITY; legend: HIERARCHICAL, COMPONENT-LEVEL]

Figure A1.1

Time-Varying Component
Failure Rate Result



REFERENCES 


1.  Weinstein, W.W., R.S. Schabowsky, and E. Gai, Digital Fly-By-Wire
    Flight Control System for a Tilt-Rotor Aircraft: Architecture
    and Reliability Studies, The C. S. Draper Laboratory, Inc.,
    Cambridge, MA, July 1983.

2.  VanderVelde, W.E., B.K. Walker, et al., 16.321 Fault Tolerant
    Control Systems Class Notes, Massachusetts Institute of
    Technology, Spring Semester, 1984.

3.  Shooman, M.L., Probabilistic Reliability: An Engineering
    Approach, McGraw-Hill, New York, New York, 1968.

4.  Siewiorek, D.P., and R.S. Swarz, The Theory and Practice of
    Reliable System Design, Digital Press, Bedford, MA, 1982.

5.  Schabowsky, Jr., R.S., et al., "Evaluation Methodologies for an
    Advanced Information Processing System," AIAA/IEEE 6th Digital
    Avionics Conference, Baltimore, MD, December 1984.

6.  Luppold, R.H., E. Gai, and B.K. Walker, "Effects of Redundancy
    Management on Reliability Modelling," Proc. 1984 Automatic
    Control Conference, June 1983, pp. 1763-1770.

7.  Kleinrock, L., Queueing Systems Volume I: Theory, John Wiley &
    Sons, 1975.

8.  Trivedi, K.S., and R.M. Geist, A Tutorial on the Care III
    Approach to Reliability Modelling, NASA CR-3488, December 1981.

9.  Butler, R.W., The Semi-Markov Unreliability Range Evaluator
    (SURE) Program, NASA Technical Memorandum 86261, July 1984.

10. White, A.L., An Approximation Formula for a Class of Markov
    Reliability Models, NASA CR-172290, January 1984.

11. Smith, L.O., Introduction to Reliability in Design, McGraw-Hill,
    New York, New York, 1976.

12. Rice, J.R., Numerical Methods, Software and Analysis,
    McGraw-Hill, New York, New York, 1983.

13. Lee, L.D., Reliability Bounds for Fault-Tolerant Systems with
    Competing Responses to Component Failures, NASA TP-2409, 1985.

14. Alger, L.S., and J.H. Lala, "A Real Time Operating System for a
    Nuclear Power Plant Computer," Seminars Power Plant Digital
    Control and Fault-Tolerant Microcomputers, Scottsdale, AZ, April,
    1985.



